Abstract

The zero-sum, perfect information pursuit-evasion differential
game is reviewed. The purpose of this thesis is to formulate a
method for generating near-optimal closed-loop solutions to these
problems. The method is then applied to a number of example
problems in order to check its validity. This method deals with
solutions in the small and is based on updating the two-point
boundary-value problem by use of the neighboring extremal path
concept.

The two differential game problems examined are a simple motion
problem and a rocket problem. Two separate cases were studied for
each problem: the fixed final time problem and the free final time
problem with a terminal constraint.

Analysis of the results obtained supports the feasibility of this
method for providing near-optimal closed-loop solutions to
differential game problems.
A METHOD FOR GENERATING CLOSED-LOOP
SOLUTIONS TO DIFFERENTIAL GAMES

I. Introduction

The original formulation of differential games was presented by
Isaacs (Ref 1) less than twenty-five years ago. It was developed
from the theory of games and deals with games in which two opposing
players are "confronted with lengthy sequences--be they continuous
or discrete--of decisions which are knit together logically so that
a perceptible and calculable pattern prevails throughout"
(Ref 1:3). Thus it soon found applications in economics, in the
development of control systems and in analyzing warfare problems.
The latter will be the area of consideration in this thesis.

We will consider games involving two players, a pursuer and an
evader, each with conflicting interests and each with complete
information about his opponent's state and available strategies.
Ideally the evader wants to select, based on the present state of
the game, the strategy that will maximize a certain quantity called
the "cost" or "payoff". The cost may be any number of things, such
as the time to capture, the distance between players, or the fuel
required by the pursuer. The pursuer at the same time wants to
select, based on the present state of the game, the strategy that
will minimize the cost. The strategies supply instructions as to
how to set the controls for each set of data measured. If the
control variables are functions of the state variables and time, we
have a closed-loop solution to the problem. In this case, if either
player plays non-optimally, the other player, if playing optimally,
will immediately take advantage of this and gain what the first
player loses. This is known as the zero-sum feature of differential
games.

Although it is evident that the closed-loop solution is the desired
solution in all differential games, often there is no practical
means of obtaining it. In most problems the costate or adjoint
differential equations, which must be integrated to provide the
link between the controls and state variables, are nonlinear,
nonhomogeneous equations that cannot be solved other than
numerically. Therefore we are left with optimal control strategies
which are functions of time and the initial conditions of the
problem. These are called open-loop control laws. Basically we have
a two-point boundary-value problem (TPBVP) with the initial states
and final costates given. If solved, it provides an optimal
open-loop trajectory which for zero-sum games is the same as the
closed-loop trajectory if both players play optimally. This is not
true if either player deviates from his optimal strategy.

The purpose of this thesis is to devise a method for obtaining
near-optimal closed-loop solutions to differential games and to
apply the method to a number of problems to demonstrate its
validity. The method is developed and an algorithm presented in
Chapter II. This method is based on the assumption that the TPBVP
can be solved to provide a needed reference optimal open-loop
trajectory. Also, the solutions considered in this thesis are
solutions in the small. That is, they refer to the smooth parts of
the solution found between the singular surfaces that separate the
parts of the playing space.

Fixed final time problems are examined in Chapters III and IV,
while free final time problems with terminal constraint are
examined in Chapters V and VI.
II. Statement of the Problem

Differential Game Problem

The zero-sum, perfect information pursuit-evasion differential game
will be represented by the following dynamic system (Ref 2:277)

$$\dot{x} = f(x,u,v,t), \quad x(t_0) = x_0 \tag{2-1}$$

where $x$ is an n-dimensional state vector, $u$ is an m-dimensional
decision (or control) vector for the pursuer, $v$ is a p-dimensional
control vector for the evader and $t_0$ represents the initial time.
The controls may or may not be subject to constraints depending on
the problem being considered. The terminal constraints (conditions
which must be satisfied when the game is over) are

$$\psi[x(t_f),t_f] = 0 \tag{2-2}$$

where $\psi$ is a q-dimensional vector, and the performance
criterion (cost or payoff) is

$$J = \phi[x(t_f),t_f] + \int_{t_0}^{t_f} L(x,u,v,t)\,dt \tag{2-3}$$

The object is to find $u^*$ and $v^*$ such that

$$J(u^*,v) \le J(u^*,v^*) \le J(u,v^*) \tag{2-4}$$

If $u^*$ and $v^*$ can be found, the pair $(u^*,v^*)$ is called a
saddle point of the game and $J(u^*,v^*)$ is called the value of the
game.
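
The saddle point inequality can be made concrete with a small numerical check. The sketch below is only an illustration: the payoff `payoff(u, v) = u**2 - v**2`, the helper name `is_saddle_point`, and the candidate grid are assumptions introduced here, not part of the thesis. It tests whether a candidate pair satisfies both halves of the inequality over a finite set of alternative controls.

```python
# Toy static payoff, assumed purely for illustration: u minimizes, v maximizes.
def payoff(u: float, v: float) -> float:
    return u * u - v * v


def is_saddle_point(J, u_star, v_star, candidates, tol=1e-12):
    """Check J(u*, v) <= J(u*, v*) <= J(u, v*) over a finite candidate set."""
    value = J(u_star, v_star)
    # The maximizing player cannot raise J by leaving v*.
    evader_cannot_gain = all(J(u_star, v) <= value + tol for v in candidates)
    # The minimizing player cannot lower J by leaving u*.
    pursuer_cannot_gain = all(value <= J(u, v_star) + tol for u in candidates)
    return evader_cannot_gain and pursuer_cannot_gain


grid = [i / 10.0 for i in range(-20, 21)]
print(is_saddle_point(payoff, 0.0, 0.0, grid))  # the pair (0, 0) passes
print(is_saddle_point(payoff, 1.0, 0.0, grid))  # (1, 0) fails: u = 0 lowers J
```

For this toy payoff the value of the game is J(0, 0) = 0; any unilateral deviation by the minimizing player raises J and any unilateral deviation by the maximizing player lowers it.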
Necessary Conditions. In order to apply the necessary conditions
for a saddle point solution in the small, the Hamiltonian (H) is
defined as

$$H(t,x,\lambda,u,v) = \lambda^T f + L \tag{2-5}$$

where $\lambda$ is an n-dimensional costate vector. This scalar
function H must be minimized over the set of admissible u and
maximized over the set of admissible v in order to have a saddle
point solution of the above differential game, that is

$$H^* = \max_v \min_u H = \min_u \max_v H \tag{2-6}$$

and the second-order necessary conditions are that

$$H_{uu} \ge 0, \quad H_{vv} \le 0 \tag{2-7}$$

As mentioned in Chapter I, singular arcs will not be considered in
this thesis. In some optimization problems, extremal arcs
($H_u = 0$) occur on which the matrix $H_{uu}$ is singular. Such
arcs are called singular arcs.

The costate differential equations are

$$\dot{\lambda}^T = -\partial H/\partial x = -H_x \tag{2-8}$$

and the transversality conditions are given by

$$H(t_f) = -\Phi_t(t_f), \quad \lambda^T(t_f) = \Phi_x(t_f) \tag{2-9}$$

where $\Phi[x(t_f),t_f] = \phi + \nu^T\psi$ and $\nu$ is a constant
Lagrange multiplier.
Open Vs. Closed-Loop Controls. As previously stated, the optimal
solution to the differential game problem is the pair of controls
$(u^*,v^*)$ which provides a saddle point of J. Attention must be
given to the interpretation of $u^*$ and $v^*$ in Eq (2-4) as
open-loop or closed-loop strategies. If the pair $(u^*,v^*)$ is a
function of time and the initial conditions (i.e. $u^*(t,x_0,t_0)$
and $v^*(t,x_0,t_0)$), one speaks of an open-loop solution. If the
controls are expressed as functions of the instantaneous state and
time

$$u^* = k_u(x,t), \quad v^* = k_v(x,t) \tag{2-10}$$

one has what is known as a feedback or closed-loop control law. The
closed-loop control law is a much more stringent type of
optimality. It means that the evader must play optimally against an
opponent whose control is produced in a feedback fashion; that is,
the pursuer can immediately take advantage of any nonoptimal play
made by the evader.

If both players play their optimal strategy, the open-loop and
closed-loop solutions are the same for zero-sum games. But if
either player deviates from his optimal play, the open-loop
solution will differ from the actual or closed-loop solution and
this difference could result in a complete change in the outcome of
the game. Therefore we would like to generate closed-loop control
laws. This can be done if the solution to the two-point
boundary-value problem (TPBVP) can be continuously updated based on
current states. In order to accomplish this, we must find the
effect on the costates due to small changes in the states; that is,
we must find $\delta\lambda(t)$ as a function of $\delta x(t)$,
which can be obtained from a neighboring extremal path approach.

Neighboring Extremal Paths

This will be an extension of Bryson and Ho's development
(Ref 2:177) to differential games. As was mentioned before, if both
players play their optimal strategy, the open and closed-loop
solutions are the same for zero-sum games. Let us suppose that we
have determined a reference trajectory by solving the TPBVP in the
small and have been given the initial conditions for the states.

First, for problems in which the final time is specified, if we
consider small perturbations from this reference extremal path
produced by small perturbations in the initial state
$\delta x(t_0)$ and in the terminal conditions $\delta\psi$, we
expect that such perturbations will give rise to perturbations
$\delta x(t)$, $\delta\lambda(t)$, $\delta u(t)$, $d\nu$ governed by
linearizing Eqs (2-1), (2-2), (2-6), (2-8) and (2-9) around the
extremal path. We therefore can obtain the following equations:

$$\delta\dot{x} = A(t)\,\delta x - B(t)\,\delta\lambda, \quad \delta x(t_0) \text{ specified} \tag{2-11}$$

$$\delta\dot{\lambda} = -C(t)\,\delta x - A^T(t)\,\delta\lambda \tag{2-12}$$

$$\partial H/\partial u = 0, \quad \partial H/\partial v = 0 \tag{2-13}$$

$$\delta\lambda(t_f) = \left[(\phi_{xx} + \nu^T\psi_{xx})\,\delta x + \psi_x^T\,d\nu\right]_{t=t_f} \tag{2-14}$$

$$\delta\psi = \left[\psi_x\,\delta x\right]_{t=t_f} \tag{2-15}$$
where

$$A(t) = f_x - f_u H_{uu}^{-1} H_{ux}$$

$$B(t) = f_u H_{uu}^{-1} f_u^T$$

$$C(t) = H_{xx} - H_{xu} H_{uu}^{-1} H_{ux}$$

which are (n x n) matrices. These equations represent a linear
two-point boundary-value problem since the coefficients are
evaluated on the extremal path.

By using the backward sweep method (Ref 2:179) for determining the
neighboring extremal path we arrive at the following matrix
differential equations

$$\dot{S} = -SA - A^TS + SBS - C \tag{2-16}$$

$$\dot{R} = -(A^T - SB)R \tag{2-17}$$

$$\dot{Q} = R^TBR \tag{2-18}$$

and boundary conditions

$$S(t_f) = \left[\phi_{xx} + \nu^T\psi_{xx}\right]_{t=t_f} \tag{2-19}$$

$$R(t_f) = \left[\psi_x^T\right]_{t=t_f} \tag{2-20}$$

$$Q(t_f) = 0 \tag{2-21}$$

where

$$\delta\lambda(t) = S(t)\,\delta x(t) + R(t)\,d\nu \tag{2-22}$$

$$\delta\psi = R^T(t)\,\delta x(t) + Q(t)\,d\nu \tag{2-23}$$

If these matrix differential equations are integrated backwards
from $t = t_f$, the relations (2-22) and (2-23) represent boundary
conditions equivalent to the terminal boundary conditions (2-14)
and (2-15) at earlier times; thus, we are "sweeping" the terminal
boundary conditions backward to earlier times. This allows us to
eventually determine $\delta x$ and $\delta\lambda$ at this earlier
time.
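
The backward sweep of the matrix Riccati equation (2-16) can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the program used in the thesis: it holds A, B and C constant (in the thesis they vary along the extremal path), uses a fixed-step fourth-order Runge-Kutta scheme, and the helper names `matmul`, `combine`, `riccati_rhs` and `sweep_backward` as well as the sample matrices are introduced here for illustration only.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def combine(terms):
    """Weighted sum of equally sized matrices given as (weight, matrix) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(w * M[i][j] for w, M in terms) for j in range(cols)]
            for i in range(rows)]


def riccati_rhs(S, A, B, C):
    """Right-hand side of Eq (2-16): dS/dt = -S A - A^T S + S B S - C."""
    return combine([(-1.0, matmul(S, A)), (-1.0, matmul(transpose(A), S)),
                    (1.0, matmul(matmul(S, B), S)), (-1.0, C)])


def sweep_backward(S_f, A, B, C, t_f, t_i, steps=1000):
    """Integrate S from t_f back to t_i with fixed-step RK4 (constant A, B, C)."""
    h = (t_i - t_f) / steps  # negative step: time runs backward
    S = S_f
    for _ in range(steps):
        k1 = riccati_rhs(S, A, B, C)
        k2 = riccati_rhs(combine([(1.0, S), (h / 2, k1)]), A, B, C)
        k3 = riccati_rhs(combine([(1.0, S), (h / 2, k2)]), A, B, C)
        k4 = riccati_rhs(combine([(1.0, S), (h, k3)]), A, B, C)
        S = combine([(1.0, S), (h / 6, k1), (h / 3, k2), (h / 3, k3), (h / 6, k4)])
    return S


# Illustrative constant coefficients (assumed values, not thesis data).
zero = [[0.0, 0.0], [0.0, 0.0]]
B = [[0.0, 0.0], [0.0, 0.2]]
identity = [[1.0, 0.0], [0.0, 1.0]]
S_0 = sweep_backward(identity, zero, B, zero, t_f=5.0, t_i=0.0)
print(S_0)
```

With A = C = 0 the equation reduces to dS/dt = SBS, and for this diagonal B the lower-right entry obeys dS22/dt = 0.2 S22^2 with S22(5) = 1, whose exact solution is S22(t) = 1/(2 - 0.2t). The numerical sweep should therefore return a matrix close to diag(1, 0.5) at t = 0.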

Now, if we consider the case where the final time is unspecified,
the nominal optimum solution must satisfy the additional necessary
condition

$$\Omega(x,u,\nu,t)\big|_{t=t_f} \equiv \left(\frac{d\Phi}{dt} + L\right)_{t=t_f} = 0 \tag{2-24}$$

where

$$\Phi = \phi(x,t) + \nu^T\psi(x,t), \quad \frac{d\Phi}{dt} = \frac{\partial\Phi}{\partial t} + \frac{\partial\Phi}{\partial x}\,\dot{x} \tag{2-25}$$

The scalar equation (2-24) determines the additional unknown
parameter, $t_f$.

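Perturbation of the necessary conditions (2-2), (2-9), and (2-24)
must take into account the perturbations in the final time, $dt_f$.
Finally this leads to

$$\delta\lambda(t_f) = \left[\frac{\partial^2\Phi}{\partial x^2}\,\delta x + \left(\frac{\partial\psi}{\partial x}\right)^T d\nu + \left(\frac{\partial\Omega}{\partial x}\right)^T dt_f\right]_{t=t_f} \tag{2-26}$$

$$d\psi = \left[\frac{\partial\psi}{\partial x}\,\delta x + \frac{d\psi}{dt}\,dt_f\right]_{t=t_f} \tag{2-27}$$

$$d\Omega = \left[\frac{\partial\Omega}{\partial x}\,\delta x + \left(\frac{d\psi}{dt}\right)^T d\nu + \frac{d\Omega}{dt}\,dt_f\right]_{t=t_f} \tag{2-28}$$

where

$$\frac{d\Omega}{dt} = \frac{\partial\Omega}{\partial t} + \frac{\partial\Omega}{\partial x}\,f, \quad \frac{d\psi}{dt} = \frac{\partial\psi}{\partial t} + \frac{\partial\psi}{\partial x}\,f \tag{2-29}$$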
Equations (2-11) through (2-13) plus (2-26) through (2-28)
represent a linear two-point boundary-value problem for a
neighboring extremal with small changes in initial conditions,
$\delta x(t_0)$, and/or small changes in the terminal conditions,
$d\psi$. These changes will, in general, produce small changes
$\delta x(t_f)$, $d\nu$ and $dt_f$.

As in the previous case, where the final time was fixed, we may
extend the backward sweep method to solve the unspecified final
time problem. By using the substitution

$$\delta\lambda(t) = S(t)\,\delta x(t) + R(t)\,d\nu + m(t)\,dt_f \tag{2-30}$$

$$d\psi = R^T(t)\,\delta x(t) + Q(t)\,d\nu + n(t)\,dt_f \tag{2-31}$$

$$d\Omega = m^T(t)\,\delta x(t) + n^T(t)\,d\nu + \alpha(t)\,dt_f \tag{2-32}$$

and after differentiating them and making use of the perturbation
equations, we obtain the following differential equations and
boundary conditions


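$$\dot{S} = -SA - A^TS + SBS - C, \quad S(t_f) = \left[\frac{\partial^2\Phi}{\partial x^2}\right]_{t=t_f} \tag{2-33}$$

$$\dot{R} = -(A^T - SB)R, \quad R(t_f) = \left[\left(\frac{\partial\psi}{\partial x}\right)^T\right]_{t=t_f} \tag{2-34}$$

$$\dot{Q} = R^TBR, \quad Q(t_f) = 0 \tag{2-35}$$

$$\dot{m} = -(A^T - SB)m, \quad m(t_f) = \left[\left(\frac{\partial\Omega}{\partial x}\right)^T\right]_{t=t_f} \tag{2-36}$$

$$\dot{n} = R^TBm, \quad n(t_f) = \left[\frac{d\psi}{dt}\right]_{t=t_f} \tag{2-37}$$

$$\dot{\alpha} = m^TBm, \quad \alpha(t_f) = \left[\frac{d\Omega}{dt}\right]_{t=t_f} \tag{2-38}$$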
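Equations (2-16) and (2-33) are identical and they are known as the
matrix Riccati equation. If equations (2-33) through (2-38) are
integrated backwards from $t_f$ to $t_i$, we can use (2-31) and
(2-32), evaluated at $t = t_i$, to determine $d\nu$ and $dt_f$ in
terms of $\delta x(t_i)$ and $d\psi$, as follows

$$d\nu = \left[\bar{Q}^{-1}(d\psi - \bar{R}^T\delta x)\right]_{t=t_i} \tag{2-39}$$

$$dt_f = -\left[\left(\frac{m^T}{\alpha} - \frac{n^T}{\alpha}\bar{Q}^{-1}\bar{R}^T\right)\delta x + \frac{n^T}{\alpha}\bar{Q}^{-1}d\psi\right]_{t=t_i} \tag{2-40}$$

where

$$\bar{Q} = Q - \frac{nn^T}{\alpha} \tag{2-41}$$

$$\bar{R} = R - \frac{mn^T}{\alpha} \tag{2-42}$$

Since we have $d\nu$ and $dt_f$ from (2-39) and (2-40),
$\delta\lambda(t_i)$ can be determined from (2-30):

$$\delta\lambda(t_i) = \left[(\bar{S} - \bar{R}\bar{Q}^{-1}\bar{R}^T)\,\delta x + \bar{R}\bar{Q}^{-1}d\psi\right]_{t=t_i} \tag{2-43}$$

where

$$\bar{S} = S - \frac{mm^T}{\alpha} \tag{2-44}$$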
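
The elimination of the multiplier change and the final-time change can be checked numerically in the scalar case. The sketch below is an illustration only: it assumes one state and one terminal constraint, assumes the perturbed final-time condition is enforced as dOmega = 0, and uses barred quantities of the form Q - n n/alpha, R - m n/alpha and S - m m/alpha. The function name `free_time_update` and all numbers are hypothetical.

```python
def free_time_update(S, R, Q, m, n, alpha, dx, dpsi):
    """Scalar sweep solution for d_nu, d_tf and d_lambda, assuming dOmega = 0."""
    Q_bar = Q - n * n / alpha
    R_bar = R - m * n / alpha
    S_bar = S - m * m / alpha
    dnu = (dpsi - R_bar * dx) / Q_bar
    dt_f = -((m / alpha - (n / alpha) * R_bar / Q_bar) * dx
             + (n / alpha) * dpsi / Q_bar)
    dlam = (S_bar - R_bar * R_bar / Q_bar) * dx + (R_bar / Q_bar) * dpsi
    return dnu, dt_f, dlam


# Arbitrary illustrative numbers (not thesis data).
S, R, Q, m, n, alpha = 0.8, 0.3, -0.5, 0.4, 0.7, 1.5
dx, dpsi = 0.05, 0.0
dnu, dt_f, dlam = free_time_update(S, R, Q, m, n, alpha, dx, dpsi)

# Residuals of the three rows of the sweep substitution.
row1 = S * dx + R * dnu + m * dt_f - dlam   # costate row
row2 = R * dx + Q * dnu + n * dt_f - dpsi   # terminal-constraint row
row3 = m * dx + n * dnu + alpha * dt_f      # dOmega, which should vanish
print(row1, row2, row3)
```

All three residuals vanishing to rounding error indicates that the closed-form expressions are algebraically consistent with the three-row substitution they were derived from.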


Neighboring Extremal Algorithm

The method used to solve the above problem is depicted in Fig 1,
page 12, and explained below.

Figure 1. Neighboring Extremal Path

GA/MC/72-4

(a) Given the initial conditions, we assume we can solve the
TPBVP. If both players play their optimal strategy, integrate the
"reference" state and costate differential equations forward from
$t_0$ to $t_f$ and get the reference optimal open-loop trajectory.
This is represented by the curve between points 1 and 3. But if the
evader plays a non-optimal strategy, the "real" trajectory (from
point 1 to 4) can be obtained by substituting this strategy into
the state and costate equations and integrating forward from $t_0$.
Stop the integration at $t_i$ after an elapsed time of $\Delta t$.

(b) Having determined the final reference states at $t_f$, enforce
the transversality conditions (Eq 2-9) and boundary conditions (for
fixed final time problems Eqs 2-19 through 2-21; for unspecified
final time problems Eqs 2-33 through 2-38) for the matrix Riccati
equations.

(c) Using the boundary conditions from step (b), integrate the
matrix Riccati differential equations backwards along the reference
trajectory from $t_f$ to $t_i$ (point 2). This gives the values of
$S(t_i)$, $R(t_i)$, $Q(t_i)$, ... which will be needed later to
compute $\delta\lambda(t_i)$.

(d) Compute the difference between the real (point 4) and reference
(point 2) states at $t_i$.

$$\delta x(t_i) = x(t_i)_{Real} - x(t_i)_{Ref} \tag{2-45}$$

(e) Using Eq (2-22) for the fixed time problem (or Eq 2-30 for the
unspecified final time problem) and the results of step (c),
compute $\delta\lambda(t_i)$ and update the costates at $t_i$ by

$$\lambda(t_i)_{new} = \lambda(t_i)_{old} + \delta\lambda(t_i) \tag{2-46}$$

(f) Now at time $t_i$, you have a "new" or updated TPBVP with the
initial conditions for the states being the real values at point 4
and the initial conditions for the costates being the values
computed in step (e).

(g) Repeat steps (a) through (f) until you reach $t_f$ for the
fixed final time problems or until the terminal constraint is
satisfied for the unspecified final time problems. We are assuming
that the linearized differential equations of the problems are
valid for small perturbations from the optimal extremal path. If
this were not a valid assumption the solution would be expected to
diverge. If it does not diverge, the assumption should be valid.
III. Pursuit-Evasion Differential Game - Simple
Motion Problem, Fixed Final Time

Statement of the Problem

This is a two-dimensional problem as depicted in Fig 2, page 16.
The pursuer and evader have no restrictions on their direction of
motion, that is, they can change direction instantaneously over the
complete 360° circle around them. There are no terminal constraints
on the problem. We will start at some time $t_0$ and run until a
fixed time which will be called $t_f$. The magnitude of the
velocity of each player is constant, but the pursuer has a speed
advantage. The cost or payoff will be one-half the square of the
distance between the players at $t_f$. In other words, we want to
determine the saddle point of

$$J(t_f) = \tfrac{1}{2}(x^2 + y^2)\big|_{t=t_f} \tag{3-1}$$

subject to the following differential equations of motion in a
relative coordinate system with the origin at the pursuer.

$$\dot{x} = V_e \cos v - V_p \cos u \tag{3-2}$$

$$\dot{y} = V_e \sin v - V_p \sin u \tag{3-3}$$

The subscripts p and e refer to the pursuer and the evader
respectively. The pursuer's control is u and the evader's control
is v, which are their respective directions of motion.


Figure 2. Simple Motion Two-Dimensional Problem

Necessary Conditions

Applying the necessary conditions for a saddle point solution, as
explained in Chapter II, the Hamiltonian H is given by

$$H = \lambda_x[V_e \cos v - V_p \cos u] + \lambda_y[V_e \sin v - V_p \sin u] \tag{3-4}$$

The Hamiltonian is to be minimized with respect to the pursuer's
control and maximized with respect to the evader's control. To do
this, it is necessary that

$$\partial H/\partial u = 0, \quad \partial H/\partial v = 0 \tag{3-5}$$

$$\partial^2 H/\partial u^2 \ge 0, \quad \partial^2 H/\partial v^2 \le 0 \tag{3-6}$$
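From Eq (3-4) we get

$$\partial H/\partial u = -V_p[-\lambda_x \sin u + \lambda_y \cos u] \tag{3-7}$$

In order to satisfy the first half of Eq (3-5) and at the same time
minimize H we must have

$$\cos u^* = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}}, \quad \sin u^* = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{3-8}$$

which gives

$$\partial H/\partial u = -V_p\left[\frac{-\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} + \frac{\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}}\right] = 0 \tag{3-9}$$

The first half of Eq (3-6) becomes

$$\partial^2 H/\partial u^2 = V_p[\lambda_x \cos u^* + \lambda_y \sin u^*] = V_p\left[\frac{\lambda_x^2}{(\lambda_x^2+\lambda_y^2)^{1/2}} + \frac{\lambda_y^2}{(\lambda_x^2+\lambda_y^2)^{1/2}}\right] > 0 \tag{3-10}$$

In a similar manner it can be shown that in order to maximize H
with respect to the evader's control v we must have

$$\cos v^* = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}}, \quad \sin v^* = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{3-11}$$

This will satisfy the second half of Eqs (3-5) and (3-6) as shown
here

$$\partial H/\partial v = V_e[-\lambda_x \sin v^* + \lambda_y \cos v^*] = V_e\left[\frac{-\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} + \frac{\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}}\right] = 0 \tag{3-12}$$

$$\partial^2 H/\partial v^2 = V_e[-\lambda_x \cos v^* - \lambda_y \sin v^*] = -V_e\left[\frac{\lambda_x^2}{(\lambda_x^2+\lambda_y^2)^{1/2}} + \frac{\lambda_y^2}{(\lambda_x^2+\lambda_y^2)^{1/2}}\right] < 0 \tag{3-13}$$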

Also by using the controls in Eqs (3-8) and (3-11) we see that

$$H^* = \max_v \min_u H = \min_u \max_v H \tag{3-14}$$

which must be satisfied. Thus Eqs (3-8) and (3-11) represent the
optimal open-loop control laws for this problem.
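
The min-max property of these control laws can be spot-checked by brute force. The following sketch is an illustration, not part of the thesis program: it evaluates the Hamiltonian of Eq (3-4) on a one-degree grid of headings for an arbitrary costate vector and confirms that the heading along the costate vector minimizes H for the pursuer and maximizes it for the evader. The speeds default to the values used later in this chapter; the costate values and function names are assumptions.

```python
import math


def hamiltonian(lam_x, lam_y, u, v, V_p=2.0, V_e=1.0):
    """Eq (3-4) for heading angles u (pursuer) and v (evader), in radians."""
    return (lam_x * (V_e * math.cos(v) - V_p * math.cos(u))
            + lam_y * (V_e * math.sin(v) - V_p * math.sin(u)))


def optimal_heading(lam_x, lam_y):
    """Common heading from Eqs (3-8) and (3-11): direction of the costate vector."""
    return math.atan2(lam_y, lam_x)


lam_x, lam_y = 3.0, -4.0  # arbitrary illustrative costates
u_star = v_star = optimal_heading(lam_x, lam_y)
H_star = hamiltonian(lam_x, lam_y, u_star, v_star)

angles = [2.0 * math.pi * i / 360 for i in range(360)]
# No pursuer heading gives a smaller H; no evader heading gives a larger H.
pursuer_ok = all(hamiltonian(lam_x, lam_y, u, v_star) >= H_star - 1e-12
                 for u in angles)
evader_ok = all(hamiltonian(lam_x, lam_y, u_star, v) <= H_star + 1e-12
                for v in angles)
print(H_star, pursuer_ok, evader_ok)
```

For these numbers the costate magnitude is 5, so the saddle value of the Hamiltonian is (V_e - V_p) times 5, that is -5.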


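The costate equations are

$$\dot{\lambda}_x = -\partial H/\partial x = 0 \tag{3-15}$$

$$\dot{\lambda}_y = -\partial H/\partial y = 0 \tag{3-16}$$

The transversality conditions give

$$\lambda_x(t_f) = x(t_f) \tag{3-17}$$

$$\lambda_y(t_f) = y(t_f) \tag{3-18}$$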

Open-loop Solution. Now, if we substitute the controls $(u^*,v^*)$
into the equations of motion, we get the reference differential
equations for the open-loop solution to the problem. The state
equations and boundary conditions are

$$\dot{x} = (V_e - V_p)\,\lambda_x/(\lambda_x^2+\lambda_y^2)^{1/2}, \quad x(0) \text{ given} \tag{3-19}$$

$$\dot{y} = (V_e - V_p)\,\lambda_y/(\lambda_x^2+\lambda_y^2)^{1/2}, \quad y(0) \text{ given} \tag{3-20}$$

and the costate equations and transversality conditions are

$$\dot{\lambda}_x = \dot{\lambda}_y = 0 \tag{3-21}$$

$$\lambda_x(t_f) = x(t_f) \tag{3-22}$$

$$\lambda_y(t_f) = y(t_f) \tag{3-23}$$

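By making use of the neighboring extremal path development of
Chapter II, we find that Eq (2-22) reduces to

$$\delta\lambda(t) = S(t)\,\delta x(t) \tag{3-24}$$

with the matrix Riccati equation (2-16) becoming

$$\dot{S} = SBS, \quad S(t_f) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{3-25}$$

where

$$B(t) = -\frac{V_e - V_p}{(\lambda_x^2+\lambda_y^2)^{3/2}} \begin{bmatrix} \lambda_y^2 & -\lambda_x\lambda_y \\ -\lambda_x\lambda_y & \lambda_x^2 \end{bmatrix} \tag{3-26}$$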


Closed-Loop Solution. For this problem, the costate differential
equations can easily be integrated and the closed-loop solution
found. Equations (3-21) through (3-23) give

$$\lambda_x = \text{constant} = x(t_f) \tag{3-27}$$

$$\lambda_y = \text{constant} = y(t_f) \tag{3-28}$$

which when substituted in Eqs (3-19) and (3-20) implies that

$$x(t)/y(t) = x(t_f)/y(t_f)$$

Therefore the optimal closed-loop control laws may be written as

$$\cos u^* = \cos v^* = \frac{x(t)}{(x^2(t)+y^2(t))^{1/2}}, \quad \sin u^* = \sin v^* = \frac{y(t)}{(x^2(t)+y^2(t))^{1/2}} \tag{3-29}$$

Substituting (3-29) into (3-2) and (3-3) gives the optimal
closed-loop differential equations of motion. But if the evader
decided to play non-optimally while the pursuer played his optimal
strategy, the closed-loop state equations of motion would be

$$\dot{x} = V_e \cos v - V_p\,x/(x^2+y^2)^{1/2} \tag{3-30}$$

$$\dot{y} = V_e \sin v - V_p\,y/(x^2+y^2)^{1/2} \tag{3-31}$$

These equations, when integrated forward from $t_0$ to $t_f$, gave
the actual trajectory which served as a basis with which to compare
the results obtained from the proposed method for generating the
closed-loop solutions for this problem.
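
The actual closed-loop trajectory described above can be reproduced with a short script. This is a sketch under stated assumptions rather than the thesis program: it uses the input data quoted in the Program Algorithm section of this chapter (initial relative position (10, 0), pursuer speed 2, evader speed 1, final time 5, evader heading fixed at 90 degrees), and a fixed-step fourth-order Runge-Kutta integrator in place of the variable-step one mentioned in the thesis. Function names are hypothetical.

```python
import math

V_P, V_E = 2.0, 1.0             # pursuer and evader speeds
V_ANGLE = math.radians(90.0)    # evader's constant non-optimal heading


def closed_loop_rhs(x, y):
    """Eqs (3-30) and (3-31): the pursuer heads along the line of sight."""
    r = math.hypot(x, y)
    return (V_E * math.cos(V_ANGLE) - V_P * x / r,
            V_E * math.sin(V_ANGLE) - V_P * y / r)


def integrate(x, y, t_f=5.0, steps=500):
    """Fixed-step RK4 from t = 0 to t_f."""
    h = t_f / steps
    for _ in range(steps):
        k1 = closed_loop_rhs(x, y)
        k2 = closed_loop_rhs(x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = closed_loop_rhs(x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = closed_loop_rhs(x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, y


x_f, y_f = integrate(10.0, 0.0)
cost = 0.5 * (x_f ** 2 + y_f ** 2)
print(x_f, y_f, cost)
```

A useful sanity check, derived here by differentiating along Eqs (3-30) and (3-31) with v = 90 degrees, is that the quantity V_p R + V_e y (with R the distance between the players) decreases at the constant rate V_p^2 - V_e^2. Starting from 20 it should equal 5 at t = 5, so the final cost must be well below the value 12.5 obtained when the evader also plays optimally.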

Program Algorithm

Based on the above development of the open-loop solution, a
computer program was written using the neighboring extremal path
approach in an effort to arrive at the closed-loop solution for
this problem. The program followed the algorithm outlined in
Chapter II and depicted in Fig 1. All integrations were done using
a variable step fourth order Runge-Kutta method.

(1) To carry out step (a) of the neighboring extremal algorithm,
the initial conditions and input data for this problem were
$$x(t_0) = 10., \quad V_p = \text{constant} = 2.$$

$$y(t_0) = 0., \quad V_e = \text{constant} = 1.$$

$$t_0 = 0., \quad t_f = 5.$$

If both players play their optimal strategy, the reference
open-loop solution to the TPBVP gives

$$x(t_f) = x(5.) = 5.$$

$$y(t_f) = y(5.) = 0.$$

which according to Eqs (3-21) through (3-23) gives the following
costates

$$\lambda_x(t) = \text{constant} = x(t_f) = 5.$$

$$\lambda_y(t) = \text{constant} = y(t_f) = 0.$$
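
These reference values can be verified by hand; the short derivation below is added as a check and uses only the state equations, costate equations and input data of this chapter. With $\lambda_y = 0$ and $\lambda_x > 0$, the optimal headings are $u^* = v^* = 0$, so the reference state equations reduce to

$$\dot{x} = V_e - V_p = 1 - 2 = -1, \quad \dot{y} = 0$$

$$x(t) = 10 - t, \quad y(t) = 0 \quad\Rightarrow\quad x(5) = 5, \quad y(5) = 0$$

The transversality conditions then give $\lambda_x = x(t_f) = 5 > 0$ and $\lambda_y = y(t_f) = 0$, consistent with the assumed signs, and the value of the game for these initial conditions is

$$J = \tfrac{1}{2}\left(5^2 + 0^2\right) = 12.5$$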

In our aim of determining control laws based on the current state
and time, we will assume that the evader decides to play a
non-optimal constant strategy of v = 90° as opposed to the optimal
strategy, which for this problem is v* = 0° according to Eq (3-11).
Therefore the "real" equations of motion are determined from
Eqs (3-2) and (3-3) to be

$$\dot{x} = -V_p\,\lambda_x/(\lambda_x^2+\lambda_y^2)^{1/2} \tag{3-32}$$

$$\dot{y} = V_e - V_p\,\lambda_y/(\lambda_x^2+\lambda_y^2)^{1/2} \tag{3-33}$$

Integrating these equations forward from $t_0$ to $t_i$ gives the
curve from point 1 to 4 in Fig 1.
(2) Now follow the procedures laid out in steps (b) through (g) of
Chapter II. Thus we generate a closed-loop trajectory for a
continuously updated TPBVP. The smaller the size of the sampling
interval and integration step, the closer we should be to the
actual closed-loop trajectory.
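
The sampled update loop for this problem can be sketched compactly. The code below is a minimal illustration under several assumptions and is not the thesis program: the costates are held constant between samples, so the reference and real state equations have constant right-hand sides and a single Euler step per sample is exact; the Riccati equation dS/dt = SBS with S(t_f) = I is solved in closed form as S = [I + B (t_f - t_i)]^(-1), where B is taken to be (V_p - V_e) / |lambda|^3 times the matrix [[ly^2, -lx ly], [-lx ly, lx^2]], instead of by Runge-Kutta integration; and the transversality conditions are not re-enforced after each update. The names `sweep_gain` and `near_optimal_cost` are hypothetical.

```python
import math

V_P, V_E, T_F = 2.0, 1.0, 5.0


def sweep_gain(lam, t_i):
    """S(t_i) for dS/dt = S B S, S(t_f) = I, with B held constant.

    Since d(S^-1)/dt = -B, the closed form S = [I + B (t_f - t_i)]^-1 is used
    here in place of a numerical Riccati integration.
    """
    lx, ly = lam
    c = (V_P - V_E) * (T_F - t_i) / (lx * lx + ly * ly) ** 1.5
    a11, a12, a22 = 1.0 + c * ly * ly, -c * lx * ly, 1.0 + c * lx * lx
    det = a11 * a22 - a12 * a12
    return ((a22 / det, -a12 / det), (-a12 / det, a11 / det))


def near_optimal_cost(sample=0.5, evader_heading=math.radians(90.0)):
    x, y = 10.0, 0.0   # initial relative position
    lam = (5.0, 0.0)   # reference costates from the open-loop solution
    steps = round(T_F / sample)
    for k in range(1, steps + 1):
        norm = math.hypot(lam[0], lam[1])
        cu, su = lam[0] / norm, lam[1] / norm  # optimal heading direction
        # Reference step: both players play optimally.
        x_ref = x + sample * (V_E - V_P) * cu
        y_ref = y + sample * (V_E - V_P) * su
        # Real step: the evader holds its own constant heading.
        x += sample * (V_E * math.cos(evader_heading) - V_P * cu)
        y += sample * (V_E * math.sin(evader_heading) - V_P * su)
        # Steps (c) to (e): sweep gain, state difference, costate update.
        S = sweep_gain(lam, k * sample)
        dx, dy = x - x_ref, y - y_ref
        lam = (lam[0] + S[0][0] * dx + S[0][1] * dy,
               lam[1] + S[1][0] * dx + S[1][1] * dy)
    return 0.5 * (x * x + y * y)


cost_coarse = near_optimal_cost(0.5)
cost_fine = near_optimal_cost(0.1)
print(cost_coarse, cost_fine)
```

After the first sample the updated costates should be close to the final state that optimal play would reach from the real position, which is the sense in which the TPBVP is "updated". Both printed costs should fall well below the optimal-play value of 12.5, since the evader's 90° heading is non-optimal.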

Results and Analysis

The results from this problem are presented in Table I, page 24,
and Fig 3, page 25. Two runs were made using the actual closed-loop
solution with different integration step sizes. The resulting costs
were the same for both runs. This cost is approximately 6% lower
than the cost obtained from the near-optimal solution for a
sampling step size of 0.5.

From the data, we see that for a specific sampling step size there
is no change in the final cost as a result of different integration
step sizes being used for the reference, matrix Riccati and real
differential equations. This indicates that one can decrease the
integration time by selecting a relatively coarse step size of .01
and yet not change the cost.

The slope of the curve for the near-optimal solution seems to
indicate that as the sampling step size approaches zero, a limiting
minimum cost is approached. This is in agreement with one's
intuition regarding the sampling interval and cost.
These runs were made without enforcing the transversality
conditions. It would be interesting to compare this data with that
obtained by enforcing the transversality conditions. This feature
was examined quite extensively in Chapter IV.

integrations. Runge-Kutta method was used
the matrix Riccati equations.
IV. Pursuit-Evasion Differential
Game - Rocket Problem, Fixed Final Time

Statement of the Problem

The formulation of this problem is similar to that of Isaacs
(Ref 1:105). But in order to have a slightly more non-linear
problem, the drag will vary as a function of the velocity squared,
whereas Isaacs represents it as a linear function of the velocity.
The pursuer is driven by a fixed thrust magnitude F, but the
direction is controlled by $\phi$.

The evader has simple motion with fixed speed, W. The action takes
place in a plane and the payoff is one-half the square of the
distance between the players at the end of a fixed final time.

The pursuer will be burdened with a friction drag proportional to
the negative of his velocity squared. Without the drag there is no
bound on the pursuer's speed. If the friction force is -k times the
speed squared, there is a natural limit on the speed squared, equal
to F/k. It is the square of the speed the pursuer would come to
asymptotically if his thrust propelled him along a straight line.
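
The limiting speed can be made explicit with a short worked step, added here as an illustration. For thrust held along a single axis, the velocity component $u$ along that axis obeys

$$\dot{u} = F - k u^2$$

which has the equilibrium

$$\dot{u} = 0 \quad\Rightarrow\quad u_\infty^2 = \frac{F}{k}$$

With the values used later in this chapter, $F = 2$ and $k = 0.1$, this gives $u_\infty^2 = 20$, or a limiting speed of about $4.47$. Since $\dot{u} > 0$ for $u < u_\infty$ and $\dot{u} < 0$ for $u > u_\infty$, the speed approaches this limit asymptotically from either side.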

We will use a moving relative coordinate system centered on the
pursuer; see Fig 4, page 34. The object is to find a saddle point
of

$$J(t_f) = \tfrac{1}{2}(x^2 + y^2)\big|_{t=t_f} \tag{4-1}$$
subject to

$$\dot{x} = \dot{x}_e - \dot{x}_p = W \sin\theta - u \tag{4-2}$$

$$\dot{y} = \dot{y}_e - \dot{y}_p = W \cos\theta - v \tag{4-3}$$

$$\dot{u} = F \sin\phi - k u^2 \tag{4-4}$$

$$\dot{v} = F \cos\phi - k v^2 \tag{4-5}$$

where u and v are the respective x and y components of the
pursuer's velocity. In this problem, as in the one in Chapter III,
there are no terminal constraints or control constraints. The
pursuer's control is $\phi$ and the evader's control is $\theta$.

Necessary Conditions

Applying the necessary conditions for a saddle point solution, the
Hamiltonian H is given by

$$H = \lambda_x[W \sin\theta - u] + \lambda_y[W \cos\theta - v] + \lambda_u[F \sin\phi - k u^2] + \lambda_v[F \cos\phi - k v^2] \tag{4-6}$$

The Hamiltonian must be minimized with respect to the pursuer's
control $\phi$ and maximized with respect to the evader's control
$\theta$. Therefore it is necessary that

$$\partial H/\partial\phi = 0, \quad \partial H/\partial\theta = 0 \tag{4-7}$$

$$\partial^2 H/\partial\phi^2 \ge 0, \quad \partial^2 H/\partial\theta^2 \le 0 \tag{4-8}$$

Applying these conditions we get

$$\partial H/\partial\phi = F[\lambda_u \cos\phi - \lambda_v \sin\phi] \tag{4-9}$$



In order to satisfy (4-7) and at the same time minimize H we must
therefore have

$$\sin\phi^* = \frac{-\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}}, \quad \cos\phi^* = \frac{-\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} \tag{4-10}$$

This also satisfies the first half of Eq (4-8).

In a similar manner it can be shown that in order to maximize H
with respect to the evader's control $\theta$ we must have

$$\sin\theta^* = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}}, \quad \cos\theta^* = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{4-11}$$

This will satisfy the second half of Eqs (4-7) and (4-8). These
optimal open-loop control laws (Eqs (4-10) and (4-11)) also satisfy
Eq (2-6).

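The costate equations are

$$\dot{\lambda}_x = -\partial H/\partial x = 0, \quad \lambda_x = \text{constant} \tag{4-12}$$

$$\dot{\lambda}_y = -\partial H/\partial y = 0, \quad \lambda_y = \text{constant} \tag{4-13}$$

$$\dot{\lambda}_u = -\partial H/\partial u = \lambda_x + 2k\lambda_u u \tag{4-14}$$

$$\dot{\lambda}_v = -\partial H/\partial v = \lambda_y + 2k\lambda_v v \tag{4-15}$$

The transversality conditions give

$$\lambda_x(t_f) = x(t_f) \tag{4-16}$$

$$\lambda_y(t_f) = y(t_f) \tag{4-17}$$

$$\lambda_u(t_f) = \lambda_v(t_f) = 0 \tag{4-18}$$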


Substituting the controls $(\phi^*,\theta^*)$ into the equations of
motion (4-2 through 4-5), we get the following reference
differential equations and boundary conditions for the open-loop
solution to the problem.

$$\dot{x} = W\lambda_x/(\lambda_x^2+\lambda_y^2)^{1/2} - u, \quad x(0) \text{ given} \tag{4-19}$$

$$\dot{y} = W\lambda_y/(\lambda_x^2+\lambda_y^2)^{1/2} - v, \quad y(0) \text{ given} \tag{4-20}$$

$$\dot{u} = -F\lambda_u/(\lambda_u^2+\lambda_v^2)^{1/2} - k u^2, \quad u(0) \text{ given} \tag{4-21}$$

$$\dot{v} = -F\lambda_v/(\lambda_u^2+\lambda_v^2)^{1/2} - k v^2, \quad v(0) \text{ given} \tag{4-22}$$
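
One way to check a transcription of the reference state and costate equations is to integrate them and monitor the Hamiltonian, which should stay constant along an extremal because the system has no explicit time dependence. The sketch below does this with fixed-step RK4. It is an illustration under stated assumptions: the parameter values W = 1, F = 2, k = 0.1 follow the input data of this chapter, the starting point is arbitrary (chosen with nonzero velocity costates to avoid the 0/0 case), and the function names are hypothetical.

```python
import math

W, F, K = 1.0, 2.0, 0.1  # evader speed, thrust magnitude, drag coefficient


def rhs(z):
    """Reference state rates and costate rates with optimal controls substituted."""
    x, y, u, v, lx, ly, lu, lv = z
    n_xy = math.hypot(lx, ly)
    n_uv = math.hypot(lu, lv)
    return (W * lx / n_xy - u,
            W * ly / n_xy - v,
            -F * lu / n_uv - K * u * u,
            -F * lv / n_uv - K * v * v,
            0.0,
            0.0,
            lx + 2.0 * K * lu * u,
            ly + 2.0 * K * lv * v)


def hamiltonian(z):
    """Hamiltonian evaluated with the optimal controls substituted."""
    x, y, u, v, lx, ly, lu, lv = z
    return (W * math.hypot(lx, ly) - lx * u - ly * v
            - F * math.hypot(lu, lv) - K * (lu * u * u + lv * v * v))


def rk4_step(z, h):
    def shifted(k, s):
        return tuple(zi + s * ki for zi, ki in zip(z, k))
    k1 = rhs(z)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(zi + h / 6 * (a + 2 * b + 2 * c + d)
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4))


# Arbitrary illustrative starting point (not thesis data).
z = (1.0, 2.0, 2.0, 4.0, 1.0, 2.0, -0.5, -1.0)
h0 = hamiltonian(z)
for _ in range(300):  # integrate 0.3 time units forward
    z = rk4_step(z, 0.001)
drift = abs(hamiltonian(z) - h0)
print(drift)
```

A drift at the level of integration error suggests that the state equations, costate equations and control laws are mutually consistent; a sign error in any one of them would show up as a steadily growing change in H.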

From the neighboring extremal path developments of Chapter II, we
find that the matrix Riccati equations for this problem are

$$\dot{S} = -SA - A^TS + SBS - C \tag{4-23}$$

$$\dot{R} = -(A^T - SB)R \tag{4-24}$$

$$\dot{Q} = R^TBR \tag{4-25}$$

The boundary conditions determined from Eqs (2-19) through (2-21)
give

$$S(t_f) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{4-26}$$

$$R(t_f) = (0) \tag{4-27}$$

since there are no terminal constraints on the problem, and

$$Q(t_f) = (0) \tag{4-28}$$
The last two boundary conditions force the right-hand sides of
Eqs (4-24) and (4-25) to be equal to zero; therefore R(t) and Q(t)
are constants, and equal to zero at all times. The (4x4)
coefficient matrices in Eq (4-23) are from the general linearized
state and costate equations (2-11) and (2-12). For this problem

$$A(t) = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -2ku & 0 \\ 0 & 0 & 0 & -2kv \end{bmatrix} \tag{4-29}$$


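$$B(t) = -\begin{bmatrix}
\dfrac{W\lambda_y^2}{(\lambda_x^2+\lambda_y^2)^{3/2}} & \dfrac{-W\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{3/2}} & 0 & 0 \\
\dfrac{-W\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{3/2}} & \dfrac{W\lambda_x^2}{(\lambda_x^2+\lambda_y^2)^{3/2}} & 0 & 0 \\
0 & 0 & \dfrac{-F\lambda_v^2}{(\lambda_u^2+\lambda_v^2)^{3/2}} & \dfrac{F\lambda_u\lambda_v}{(\lambda_u^2+\lambda_v^2)^{3/2}} \\
0 & 0 & \dfrac{F\lambda_u\lambda_v}{(\lambda_u^2+\lambda_v^2)^{3/2}} & \dfrac{-F\lambda_u^2}{(\lambda_u^2+\lambda_v^2)^{3/2}}
\end{bmatrix} \tag{4-30}$$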


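$$C(t) = -\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2k\lambda_u & 0 \\ 0 & 0 & 0 & 2k\lambda_v \end{bmatrix} \tag{4-31}$$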


Equation (2-22) reduces to

$$\delta\lambda(t) = S(t)\,\delta x(t) \tag{4-32}$$

which reflects the effect on the costates due to changes in the state variables.
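To make the structure of Eq (4-23) concrete, the sketch below builds the coefficient matrices of Eqs (4-29) to (4-31) at one point of a trajectory and evaluates the right-hand side of the matrix Riccati equation. It is an illustration under assumptions: the helper names are hypothetical, the zero entries of A and C and the block-diagonal layout of B are taken as read from the equations above, and plain nested lists are used instead of a matrix library.

```python
import math


def matmul(P, Q):
    """Product of two matrices stored as nested lists."""
    return [[sum(P[i][m] * Q[m][j] for m in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(row) for row in zip(*P)]


def coefficient_matrices(state, costate, W, F, k):
    """A, B, C of Eqs (4-29) to (4-31); undefined if a costate pair is zero."""
    x, y, u, v = state
    lx, ly, lu, lv = costate
    d_xy = math.hypot(lx, ly) ** 3
    d_uv = math.hypot(lu, lv) ** 3
    A = [[0.0, 0.0, -1.0, 0.0],
         [0.0, 0.0, 0.0, -1.0],
         [0.0, 0.0, -2.0 * k * u, 0.0],
         [0.0, 0.0, 0.0, -2.0 * k * v]]
    # The leading minus sign of Eq (4-30) is already applied to every entry.
    B = [[-W * ly * ly / d_xy, W * lx * ly / d_xy, 0.0, 0.0],
         [W * lx * ly / d_xy, -W * lx * lx / d_xy, 0.0, 0.0],
         [0.0, 0.0, F * lv * lv / d_uv, -F * lu * lv / d_uv],
         [0.0, 0.0, -F * lu * lv / d_uv, F * lu * lu / d_uv]]
    C = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, -2.0 * k * lu, 0.0],
         [0.0, 0.0, 0.0, -2.0 * k * lv]]
    return A, B, C


def riccati_rhs(S, A, B, C):
    """S_dot = -S A - A^T S + S B S - C, Eq (4-23)."""
    SA = matmul(S, A)
    AtS = matmul(transpose(A), S)
    SBS = matmul(matmul(S, B), S)
    n = len(S)
    return [[-SA[i][j] - AtS[i][j] + SBS[i][j] - C[i][j]
             for j in range(n)] for i in range(n)]
```

Because A, B and C depend on the reference states and costates, they have to be re-evaluated at every integration step.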

Program  Algorithm 

Just as in the Simple-Motion problem of Chapter III, we want to start with the given initial conditions and then solve the TPBVP to obtain a reference trajectory. The availability of this reference trajectory is a necessary feature of this method for generating a closed-loop solution to differential games. However, this algorithm bypassed solving the TPBVP and instead used a backward integration as a means of generating an optimal open-loop solution.

(1) As before, a fixed final time of $t_f = 5.0$ was assumed along with the following input data

x(tf) = 1.        W  = 1.
y(tf) = 2.        F  = 2.
u(tf) = 2.        k  = .1
v(tf) = 4.        t0 = 0.

Basically we are starting the program from step (b) of the algorithm in Chapter II. Therefore, the transversality conditions Eqs (4-16) to (4-18) may be written as

$$\lambda_x(t_f) = 1., \qquad \lambda_u(t_f) = 0.$$

$$\lambda_y(t_f) = 2., \qquad \lambda_v(t_f) = 0.$$

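Having both $\lambda_u(t_f)$ and $\lambda_v(t_f)$ equal to zero presents a problem when evaluating $\dot u(t_f)$ and $\dot v(t_f)$ from Eqs (4-21) and (4-22). To avoid dividing by zero, we apply L'Hospital's rule to the terms that would have zero in the denominator, and we obtain the following expressions for $\lambda_u$ and $\lambda_v$ as functions of $\lambda_x$ and $\lambda_y$.

$$\lambda_u = \lambda_x\,\delta t$$

$$\lambda_v = \lambda_y\,\delta t$$

where $\delta t = t - t_f$.

Therefore

$$\frac{\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}} = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}}$$

and

$$\frac{\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}}$$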

(2) Now integrate the reference state, costate and matrix Riccati equations backwards from $t_f$ to $t_0$. This gives us the reference open-loop trajectory represented in Fig 1 by the curve from point 3 to 1. We also have the reference values of the states and S(t) at the sampling time $t_1$, which will be needed later to compute $\delta\lambda(t_1)$.

(3) With the initial conditions for the problem now specified, compute the "real" trajectory (from point 1 to 4) by substituting the actual strategies of the players into the differential equations of motion (Eqs 4-2 through 4-5) and integrate forward from $t_0$ to $t_1$, the sampling time. For this problem we assumed the pursuer played his optimal strategy (Eq 4-10) while the evader played the constant non-optimal control of $\theta = 0°$. Using $(\phi^*, \theta)$ we get the following "real" equations of motion

$$\dot x = -u \tag{4-33}$$

$$\dot y = W - v \tag{4-34}$$

$$\dot u = \frac{-F\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}} - ku^2 \tag{4-35}$$

$$\dot v = \frac{-F\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} - kv^2 \tag{4-36}$$

(4) Now follow the same procedures laid out in steps (d) through (g) of Chapter II. The only thing that is different is the number of differential equations involved. In this problem there are twenty-four differential equations (4 state, 4 costate, and 16 matrix Riccati) which are integrated backwards in step 2, as opposed to four for the Simple-Motion problem.
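The count of twenty-four equations in step (4) suggests carrying the states, costates and the elements of S(t) in one flat vector during the backward integration. The sketch below shows one possible bookkeeping scheme; the names `pack` and `unpack` and the row-major ordering of S are assumptions, not details given in the thesis.

```python
def pack(state, costate, S):
    """Flatten 4 states, 4 costates and the 4x4 matrix S into 24 numbers."""
    flat_S = [S[i][j] for i in range(4) for j in range(4)]  # row-major order
    return list(state) + list(costate) + flat_S


def unpack(z):
    """Inverse of pack: recover (state, costate, S) from a 24-element list."""
    if len(z) != 24:
        raise ValueError("expected 24 values (4 state, 4 costate, 16 Riccati)")
    state = list(z[0:4])
    costate = list(z[4:8])
    S = [list(z[8 + 4 * i: 12 + 4 * i]) for i in range(4)]
    return state, costate, S
```

Since S(t) is symmetric, only ten of its sixteen elements are independent; integrating all sixteen, as the text describes, keeps the bookkeeping simple at the cost of some redundant work.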

Results  and  Analysis 

Based on the input data, if both players played optimally, the cost $J(t_f)$ would be equal to 2.5. But we assumed the evader did not play optimally. Therefore, for a zero-sum differential game the pursuer should gain what the evader loses, and we would expect the cost at $t_f$ to be less than 2.5.

The results, as shown in Table II, page 37, and Fig 5, page 38, do not completely bear this out. We see that for values of $\Delta t$ less than .2 the cost shoots up instead of approaching some minimum optimal cost. This is due to the fact that both players are actually playing nonoptimally. The algorithm provides the pursuer with a "near" optimal strategy as opposed to the optimal strategy. In other runs it was found that integration step sizes equal to or greater than .01 provided too coarse an integration to produce useful results.

Figure 4. Two-Dimensional Rocket Problem

In runs number nine and ten, the results obtained by using a simple predict-correct integration method were compared with those from the variable step fourth order Runge-Kutta method. It was felt that the resulting costs were close enough to justify using the method which was easiest to program, although it provided a much finer integration than was necessary and caused the program to take longer to run. Therefore, the Runge-Kutta integration method was used for all subsequent runs.

A set of runs was used to study the effect of enforcing the transversality conditions. The first runs were made without enforcing the transversality conditions. They are the runs in Table II with no asterisk attached to the final cost. For these runs, the final values of the costates obtained from the forward integration in step a were used in the next cycle as starting values in step c.

For the next set of runs (those with a single asterisk on the final cost in Table II) the transversality conditions were enforced. That is, in step b

$$\lambda_x(t_f) = x(t_f)$$

$$\lambda_y(t_f) = y(t_f)$$

For the final set of runs, the following substitutions were made in step b to obtain a modified enforced transversality condition which would be used in step c of the next cycle.

$$\lambda_x(t_f) = \tfrac{1}{2}\,[x(t_f) + \lambda_x(t_f)]$$

$$\lambda_y(t_f) = \tfrac{1}{2}\,[y(t_f) + \lambda_y(t_f)]$$
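The three sets of runs differ only in how the terminal costates are chosen before the next cycle. A minimal sketch of that choice is given below; the function name and the mode strings are hypothetical labels introduced here for illustration.

```python
def terminal_costates(x_tf, y_tf, lam_x_tf, lam_y_tf, mode):
    """Pick the terminal values (lam_x, lam_y) used to start the next cycle.

    mode = "none":     keep the costates produced by the forward integration
    mode = "enforced": lam(t_f) = state(t_f)
    mode = "modified": lam(t_f) = average of state(t_f) and integrated costate
    """
    if mode == "none":
        return lam_x_tf, lam_y_tf
    if mode == "enforced":
        return x_tf, y_tf
    if mode == "modified":
        return 0.5 * (x_tf + lam_x_tf), 0.5 * (y_tf + lam_y_tf)
    raise ValueError("unknown mode: " + repr(mode))
```

The modified rule moves the costates only halfway toward the enforced values, which can be read as a damped version of the enforced condition.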

As shown in Table II and Fig 5, the results from all three sets of runs were very close. For this problem, enforcing the transversality conditions appears to have had negligible influence on the final cost. This may have been due to the relatively small value of $t_f$. Had it been assumed to be much greater than 5.0, a difference in the costs may have been detected.

According to Fig 5, there is a relatively large range of sampling step sizes which provides a final cost very near the minimum cost. Therefore, by choosing a sampling step size $\Delta t$ in the range of 1.0 we can get a very near optimal closed-loop solution.
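The statement about the range of useful sampling step sizes can be checked against the tabulated costs. The sketch below copies the Table II values for the runs with no asterisk and an integration step of .001; the tolerance of 0.02 in the usage note is an arbitrary choice made for this illustration, not a threshold from the thesis.

```python
# Final cost J(t_f) keyed by sampling step size, copied from Table II
# (runs 1, 3, 6, 11, 15, 18 and 21: no asterisk, integration step .001).
COST_BY_SAMPLING_STEP = {
    0.05: 2.8370,
    0.1: 2.2752,
    0.2: 2.0796,
    0.5: 1.9958,
    1.0: 1.9940,
    1.25: 2.0057,
    2.5: 2.1463,
}


def best_sampling_step(costs):
    """Sampling step size with the lowest tabulated final cost."""
    return min(costs, key=costs.get)


def steps_within(costs, tolerance):
    """Sampling step sizes whose cost is within `tolerance` of the lowest cost."""
    best = min(costs.values())
    return sorted(dt for dt, c in costs.items() if c - best <= tolerance)
```

With these values, `best_sampling_step(COST_BY_SAMPLING_STEP)` selects 1.0, and `steps_within(COST_BY_SAMPLING_STEP, 0.02)` lists 0.5, 1.0 and 1.25, which matches the flat region of the curve described in the text.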
Table II

Results of Rocket, Fixed Final Time Problem

Run     Sampling        Integration     Final Cost
No.     Step Size Δt    Step Size       J(tf) = 1/2(x²+y²) at tf
                                        = 2.5 (Input Data)

1       .05             .001            2.8370
2       .05             .001            2.6732**
3       .1              .001            2.2752
4       .1              .001            2.2336**
5       .2              .0005           1.9933
6       .2              .001            2.0796
7       .2              .001            2.0663*
8       .2              .001            2.0681**
9a      .5              .0005           1.9459
10      .5              .0005           1.9621
11      .5              .001            1.9958
12      .5              .001            1.9917*
13      .5              .001            1.9944**
14      1.0             .0005           1.9748
15      1.0             .001            1.9940
16      1.0             .001            2.0025*
17      1.0             .001            2.0023**
18      1.25            .001            2.0057
19      1.25            .001            2.0257*
20      1.25            .001            2.0193**
21      2.5             .001            2.1463

a:  For this run a simple predict-correct integration scheme was used. All other runs used a fourth order Runge-Kutta integration method.

*:  Transversality conditions enforced.

**: Modified transversality conditions enforced.

Figure 5. Rocket Problem, Cost Vs. Sampling Step Size

V. Pursuit-Evasion Differential Game-Simple Motion
Problem, Free Final Time with Terminal Constraint


Statement  of  the  Problem 

The basic problem is the same as that of Chapter III: the pursuer and evader have constant velocity, unrestricted simple planar motion, with the pursuer having the speed advantage. The game will be over not at a fixed final time but when the terminal constraint is satisfied, that is, when

$$\psi[x(t_f)] = \tfrac{1}{2}\,[x^2(t_f) + y^2(t_f)] - \tfrac{1}{2} = 0 \tag{5-1}$$

This may be pictured as a capture circle with the center at the pursuer's position. The terminal constraint $\psi$ will be satisfied if the evader is forced inside this unit circle.

The object of the game for the pursuer is to accomplish this in the minimum time possible. The evader, when playing his optimal strategy, aims to prevent capture or at least delay it as long as possible. In other words, we want to determine the saddle point of

$$J(t_f) = \int_{t_0}^{t_f} dt = t_f - t_0 \tag{5-2}$$

which means we have the minimum time to capture. The game is subject to the same relative differential equations of motion

$$\dot x = V_e \cos v - V_p \cos u \tag{5-3}$$

$$\dot y = V_e \sin v - V_p \sin u \tag{5-4}$$
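The termination rule can be illustrated with a short simulation of Eqs (5-3) and (5-4) that stops when the constraint of Eq (5-1) is met. This is a sketch under assumptions: both players hold constant heading angles, a fixed-step Euler rule is used, and the start point (6, 0) is a hypothetical initial relative position chosen so that a tail chase with $V_e = 1$ and $V_p = 2$ ends on the unit circle near t = 5.

```python
import math


def psi(x, y):
    """Terminal constraint of Eq (5-1); zero on the unit capture circle."""
    return 0.5 * (x * x + y * y) - 0.5


def time_to_capture(x0, y0, u, v, Ve, Vp, dt=0.001, t_max=100.0):
    """Euler-integrate Eqs (5-3), (5-4) with constant headings u (pursuer)
    and v (evader) until psi <= 0. Returns the capture time, or None if
    capture has not occurred by t_max."""
    x, y, t = x0, y0, 0.0
    while psi(x, y) > 0.0:
        if t >= t_max:
            return None
        x += dt * (Ve * math.cos(v) - Vp * math.cos(u))
        y += dt * (Ve * math.sin(v) - Vp * math.sin(u))
        t += dt
    return t
```

With both headings equal to zero the relative position closes at the rate $V_p - V_e$, so the capture time is simply the initial separation outside the circle divided by the speed advantage.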
Necessary Conditions

Applying the necessary conditions for a saddle point solution, the Hamiltonian H may be written from Eq (2-6) as

$$H = 1 + \lambda_x[V_e \cos v - V_p \cos u] + \lambda_y[V_e \sin v - V_p \sin u] \tag{5-5}$$

It can be seen that, as far as the controls (u,v) are concerned, H for this problem is the same as Eq (3-4); therefore the same (u*,v*) will provide a saddle point solution for both problems. These controls are

$$\sin u^* = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}}, \qquad \cos u^* = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{5-6}$$

$$\sin v^* = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}}, \qquad \cos v^* = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{5-7}$$

The costate equations are

$$\dot\lambda_x = -\partial H/\partial x = 0 \tag{5-8}$$

$$\dot\lambda_y = -\partial H/\partial y = 0 \tag{5-9}$$

and the transversality conditions (from Eq 2-9) give

$$\lambda_x(t_f) = \nu x(t_f) \tag{5-10}$$

$$\lambda_y(t_f) = \nu y(t_f) \tag{5-11}$$

where $\nu$ is a constant Lagrange multiplier.

Now substituting (u*,v*) in Eqs (5-3) and (5-4) we get the reference differential equations for the open-loop solution to the problem.

$$\dot x = \frac{(V_e - V_p)\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{5-12}$$

$$\dot y = \frac{(V_e - V_p)\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{5-13}$$
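Because the costates are constant (Eqs 5-8 and 5-9), the right-hand sides of Eqs (5-12) and (5-13) are constant and the reference trajectory is a straight line that can be written in closed form. The sketch below evaluates it backwards from the terminal point; the function name is hypothetical, and the usage values are the input data of this chapter ($t_f = 5$, $x(t_f) = 1$, $y(t_f) = 0$, $V_e = 1$, $V_p = 2$) together with the assumption that the terminal costate points along (1, 0), as Eqs (5-10) and (5-11) imply for a positive multiplier.

```python
import math


def reference_state(t, tf, x_tf, y_tf, lam_x, lam_y, Ve, Vp):
    """Reference (x, y) at time t from Eqs (5-12), (5-13) with constant costates."""
    norm = math.hypot(lam_x, lam_y)
    x_dot = (Ve - Vp) * lam_x / norm
    y_dot = (Ve - Vp) * lam_y / norm
    # Straight-line motion: step back from the terminal point by (tf - t).
    return x_tf - x_dot * (tf - t), y_tf - y_dot * (tf - t)
```

For the values above, `reference_state(0.0, 5.0, 1.0, 0.0, 1.0, 0.0, 1.0, 2.0)` places the evader six units ahead of the pursuer at the initial time, which is the kind of initial condition the backward integration is used to generate.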
We see from the extremal path development in Chapter II that for the case of unspecified final time the changes in the costates $\delta\lambda(t)$ are a function of $\delta x(t)$, $d\nu$ and $dt_f$. Equation (2-30) applies here

$$\delta\lambda(t) = S(t)\,\delta x(t) + R(t)\,d\nu + m(t)\,dt_f \tag{5-14}$$

The additional necessary condition of Eq (2-24) must also be satisfied. For this problem we see that

$$\Phi = \nu[\tfrac{1}{2}(x^2+y^2) - \tfrac{1}{2}] \tag{5-15}$$

$$\frac{d\Phi}{dt} = \frac{\partial\Phi}{\partial x}\dot x + \frac{\partial\Phi}{\partial y}\dot y = \nu(x\dot x + y\dot y) \tag{5-16}$$

This gives

$$\Omega(t_f) = \left[\frac{\nu(V_e - V_p)(x\lambda_x + y\lambda_y)}{(\lambda_x^2+\lambda_y^2)^{1/2}}\right]_{t=t_f} + 1 = 0 \tag{5-17}$$

Substitute Eqs (5-10) and (5-11) into (5-17) and we get the value of the arbitrary constant

$$\nu = \left\{(V_p - V_e)[x^2(t_f) + y^2(t_f)]^{1/2}\right\}^{-1} \tag{5-18}$$
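As a numerical check of this step, the sketch below computes the multiplier from Eq (5-18) and substitutes it back into Eq (5-17). The function names are hypothetical, and the check assumes the pursuer is faster ($V_p > V_e$), so that $\nu$ is positive.

```python
import math


def nu_multiplier(x_tf, y_tf, Ve, Vp):
    """Lagrange multiplier of Eq (5-18)."""
    return 1.0 / ((Vp - Ve) * math.hypot(x_tf, y_tf))


def omega_tf(x_tf, y_tf, Ve, Vp):
    """Left-hand side of Eq (5-17), using Eqs (5-10), (5-11) for the costates."""
    nu = nu_multiplier(x_tf, y_tf, Ve, Vp)
    lam_x, lam_y = nu * x_tf, nu * y_tf
    dot = x_tf * lam_x + y_tf * lam_y
    return nu * (Ve - Vp) * dot / math.hypot(lam_x, lam_y) + 1.0
```

For the input data of this chapter ($x(t_f) = 1$, $y(t_f) = 0$, $V_e = 1$, $V_p = 2$) the multiplier evaluates to 1 and the condition of Eq (5-17) is met.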


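We will now determine the terms in Eqs (2-26) to (2-28). From Eq (5-15) we get

$$\left[\frac{\partial^2\Phi}{\partial x^2}\right]_{t=t_f} = \begin{bmatrix} \dfrac{\partial^2\Phi}{\partial x^2} & \dfrac{\partial^2\Phi}{\partial x\,\partial y} \\ \dfrac{\partial^2\Phi}{\partial x\,\partial y} & \dfrac{\partial^2\Phi}{\partial y^2} \end{bmatrix}_{t=t_f} = \nu\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{5-19}$$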


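Using Eq (5-17) we get

$$\left[\frac{\partial\Omega}{\partial x}\right]_{t_f} = \begin{bmatrix} \partial\Omega/\partial x \\ \partial\Omega/\partial y \end{bmatrix}_{t_f} = \frac{\nu(V_e - V_p)}{(\lambda_x^2+\lambda_y^2)^{1/2}} \begin{bmatrix} \lambda_x \\ \lambda_y \end{bmatrix}_{t=t_f} \tag{5-20}$$

and

$$\frac{d\Omega}{dt}\bigg|_{t_f} = \frac{\partial\Omega}{\partial t} + \left[\frac{\partial\Omega}{\partial x}, \frac{\partial\Omega}{\partial y}\right] \begin{bmatrix} \dot x \\ \dot y \end{bmatrix}_{t=t_f} = \nu(V_e - V_p)^2\Big|_{t_f} \tag{5-21}$$

Equation (5-1) gives

$$\left[\frac{\partial\psi}{\partial x}\right]_{t_f} = \begin{bmatrix} \partial\psi/\partial x \\ \partial\psi/\partial y \end{bmatrix}_{t_f} = \nu^{-1} \begin{bmatrix} \lambda_x \\ \lambda_y \end{bmatrix}_{t=t_f} \tag{5-22}$$

and

$$\frac{d\psi}{dt}\bigg|_{t_f} = \frac{\partial\psi}{\partial t} + \left[\frac{\partial\psi}{\partial x}, \frac{\partial\psi}{\partial y}\right] \begin{bmatrix} \dot x \\ \dot y \end{bmatrix}_{t=t_f} = \frac{(V_e - V_p)}{\nu}\,[\lambda_x^2(t_f) + \lambda_y^2(t_f)]^{1/2} \tag{5-23}$$

Using the above development and substituting into Eqs (2-33) to (2-38) we get the following differential equations and boundary conditions for this problem.

$$\dot S = SBS, \qquad S(t_f) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{5-24}$$

$$\dot R = SBR, \qquad R(t_f) = \begin{bmatrix} x(t_f) \\ y(t_f) \end{bmatrix} \tag{5-25}$$

$$\dot Q = R^TBR, \qquad Q(t_f) = 0 \tag{5-26}$$

$$\dot m = SBm, \qquad m(t_f) = \left(\frac{\partial\Omega}{\partial x}\right)_{t_f} \tag{5-27}$$

$$\dot n = R^TBm, \qquad n(t_f) = \left(\frac{d\psi}{dt}\right)_{t_f} \tag{5-28}$$

$$\dot\alpha = m^TBm, \qquad \alpha(t_f) = \left(\frac{d\Omega}{dt}\right)_{t_f} \tag{5-29}$$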


where

$$B(t) = -\frac{(V_e - V_p)}{(\lambda_x^2+\lambda_y^2)^{3/2}} \begin{bmatrix} \lambda_y^2 & -\lambda_x\lambda_y \\ -\lambda_x\lambda_y & \lambda_x^2 \end{bmatrix} \tag{5-30}$$

Problem Algorithm

As mentioned in previous chapters, all real problems start with some given initial conditions. The TPBVP must then be solved to provide the needed reference open-loop trajectory to be used in this method for generating closed-loop solutions. However, as in Chapter III, rather than solve the TPBVP given specific initial conditions, this algorithm uses a backward integration as a means of generating a reference optimal open-loop solution.

(1) To terminate the game at some minimum time $t_f$, the terminal constraint must be satisfied. Therefore we will assume the following input data.

tf    = 5.0        t0 = 0.
x(tf) = 1.0        Ve = 1.0
y(tf) = 0          Vp = 2.0

This allows us to solve for the terminal conditions of the costate, matrix Riccati and auxiliary differential equations.

(2) The program then integrates the reference states, costates, matrix Riccati and auxiliary differential equations backwards from $t_f$ to $t_1$, at which time the reference states $x(t_1)$, $S(t_1)$, $R(t_1)$, $Q(t_1)$, $m(t_1)$, $n(t_1)$ and $\alpha(t_1)$ are stored. The backward integration is then continued to $t_0$ in order to determine the initial state and costate values.

(3) With this accomplished, integrate the "real" equations of motion forward from $t_0$ to $t_1$. In this problem we assume the same real equations of motion as used in Chapter III, that is

$$\dot x = \frac{-V_p\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{5-31}$$

$$\dot y = V_e - \frac{V_p\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{5-32}$$

(4) Now compute the difference between the "real" and "reference" states at $t_1$.

$$\delta x(t_1) = x_{Real}(t_1) - x_{Ref}(t_1) \tag{5-33}$$

$$\delta y(t_1) = y_{Real}(t_1) - y_{Ref}(t_1) \tag{5-34}$$

(5) There is now enough information to use Eqs (2-39) and (2-40) to compute $d\nu$ and $dt_f$, which then allows us to solve for $\delta\lambda(t_1)$ using (2-43).

(6) Next, compute the new costates and $t_f$ by using

$$\lambda(t_1)_{new} = \lambda(t_1)_{old} + \delta\lambda(t_1) \tag{5-35}$$

$$t_{f\,new} = t_{f\,old} + dt_f \tag{5-36}$$

(7) In this step, integrate the reference state and costate equations forward to determine the states at some new final time. Here we have a choice between two approaches. We could stop the forward integration at the computed $t_f$, or we could stop whenever the terminal constraint is satisfied, that is, whenever $\psi = 0$. The latter approach was used for this problem.

(8) Now having determined the new final states, enforce the transversality conditions by recomputing the terminal conditions on the costate, matrix Riccati and auxiliary differential equations. Then go back to step 2 and repeat the cycle.
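Steps (1) and (8) both require the terminal conditions to be evaluated from the final states. The sketch below collects the scalar and vector terminal values for this chapter from Eqs (5-10), (5-11), (5-18), (5-20), (5-21), (5-23) and (5-25). It is an illustration only: the function name and the dictionary keys are hypothetical, a positive multiplier ($V_p > V_e$) is assumed, and the constant terminal values of S and Q are left out.

```python
import math


def terminal_conditions(x_tf, y_tf, Ve, Vp):
    """Terminal values needed to restart the backward integration."""
    r = math.hypot(x_tf, y_tf)
    nu = 1.0 / ((Vp - Ve) * r)                    # Eq (5-18)
    lam = (nu * x_tf, nu * y_tf)                  # Eqs (5-10), (5-11)
    lam_norm = math.hypot(lam[0], lam[1])
    R = (x_tf, y_tf)                              # Eq (5-25)
    m = tuple(nu * (Ve - Vp) * l / lam_norm for l in lam)  # Eqs (5-20), (5-27)
    n = (Ve - Vp) * lam_norm / nu                 # Eqs (5-23), (5-28)
    alpha = nu * (Ve - Vp) ** 2                   # Eqs (5-21), (5-29)
    return {"nu": nu, "lam": lam, "R": R, "m": m, "n": n, "alpha": alpha}
```

For the input data of step (1) this gives a multiplier of 1, a terminal costate of (1, 0), m = (-1, 0), n = -1 and alpha = 1.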

Results  and  Analysis 

The results of this problem are presented in Table III, page 47, and Fig 6, page 48. The terminal constraint for this free final time problem was assumed to be a circle around the pursuer of radius equal to one. This was sufficient for capture to occur in all cases. It has not been determined just how small this circle could be and still assure capture. This would be good to know in a dogfight situation, where the minimum radius of capture may represent the minimum firing range for the weapons the pursuer has on the aircraft. To close inside this minimum range would be a mistake.

From the data we see that there is a broad range of integration step sizes for a specific sampling interval which will provide a fairly uniform final cost. Therefore we could use the larger step size 0.1 to decrease integration time and the cost would change by less than 5% of the average value for the case where the sampling interval is equal to [value illegible in the source].


Three runs were made to determine the effect of enforcing the transversality conditions. Figure 6 shows that for sampling step sizes of 0.2 and 0.5 there appears to be very little effect. But for a sampling interval of 1.0 there is definitely a reduction in final cost due to enforcing the transversality conditions.

Here again, as in Chapter III, the slope of the curve seems to indicate that as the sampling step size approaches zero, a limiting minimum cost is approached.
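The sensitivity of the cost to the integration step size can be quantified from the tabulated results. The sketch below uses the Table III costs for the runs without enforced transversality conditions and computes, for each sampling step size, the spread of the cost over the four integration step sizes as a percentage of the average. This is one possible reading of "less than 5% of the average value", offered as an assumption rather than the author's own calculation.

```python
# Costs from Table III, runs without an asterisk, keyed by sampling step size.
# Each list is ordered by integration step size: .001, .005, .01, .1
TABLE_III = {
    0.2: [3.1590, 3.1650, 3.1700, 3.390],
    0.5: [3.3370, 3.3400, 3.3500, 3.5000],
    1.0: [3.7580, 3.7600, 3.7700, 3.9000],
}


def relative_spread_percent(costs):
    """(max - min) of the costs as a percentage of their mean."""
    mean = sum(costs) / len(costs)
    return 100.0 * (max(costs) - min(costs)) / mean
```

Under this reading the spread is roughly 7% for a sampling step size of 0.2, and below 5% for 0.5 and 1.0.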
Table III

Results of Simple Motion, Free Final Time Problem

Run     Sampling        Integration     Computed Final Time
No.     Step Size Δt    Step Size       Cost, J(tf)
                                        = 5. (Input Data)

1       .2              .001            3.1590
1*      .2              .001            3.1880*
2       .2              .005            3.1650
3       .2              .01             3.1700
4       .2              .1              3.390
5       .5              .001            3.3370
5*      .5              .001            3.3490*
6       .5              .005            3.3400
7       .5              .01             3.3500
8       .5              .1              3.5000
9       1.0             .001            3.7580
9*      1.0             .001            3.5940*
10      1.0             .005            3.7600
11      1.0             .01             3.7700
12      1.0             .1              3.9000

*: Transversality conditions enforced.
VI. Pursuit-Evasion Differential Game-Rocket Problem,
Free Final Time with Terminal Constraint

Statement of the Problem

The problem is basically the same as that of Chapter IV, except that the game will terminate not at a fixed final time but when the terminal constraint $\psi(t_f)$ is satisfied. The terminal constraint will be

$$\psi[x(t_f)] = \tfrac{1}{2}\,[x^2(t_f) + y^2(t_f)] - \tfrac{1}{2} = 0 \tag{6-1}$$

This represents a unit circle, centered on the pursuer, in a relative coordinate system. Therefore, the game will be over when the evader is forced inside this unit circle, or in other words when capture occurs. The object will be to capture in minimum time, which means we must determine the saddle point solution of

$$J(t_f) = \int_{t_0}^{t_f} dt = t_f - t_0 \tag{6-2}$$

subject to

$$\dot x = W \sin\theta - u \tag{6-3}$$

$$\dot y = W \cos\theta - v \tag{6-4}$$

$$\dot u = F \sin\phi - ku^2 \tag{6-5}$$

$$\dot v = F \cos\phi - kv^2 \tag{6-6}$$

where, as in Chapter IV, u and v are the respective x and y components of the pursuer's velocity and $\phi$ and $\theta$ are the respective pursuer's and evader's controls.
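Necessary Conditions

From Eq (2-6) the Hamiltonian may be written as

$$H = 1 + \lambda_x[W \sin\theta - u] + \lambda_y[W \cos\theta - v] + \lambda_u[F \sin\phi - ku^2] + \lambda_v[F \cos\phi - kv^2] \tag{6-7}$$

It can be seen that, as far as the controls $(\phi,\theta)$ are concerned, H for this problem is the same as Eq (4-6); therefore the same $(\phi^*,\theta^*)$ will provide a saddle point solution for both problems. These controls are

$$\sin\phi^* = \frac{-\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}}, \qquad \cos\phi^* = \frac{-\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} \tag{6-8}$$

$$\sin\theta^* = \frac{\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}}, \qquad \cos\theta^* = \frac{\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} \tag{6-9}$$

The costate equations are the same as for Chapter IV,

$$\dot\lambda_x = -\partial H/\partial x = 0 \tag{6-10}$$

$$\dot\lambda_y = -\partial H/\partial y = 0 \tag{6-11}$$

$$\dot\lambda_u = -\partial H/\partial u = \lambda_x + 2k\lambda_u u \tag{6-12}$$

$$\dot\lambda_v = -\partial H/\partial v = \lambda_y + 2k\lambda_v v \tag{6-13}$$

and using Eq (2-9), the transversality conditions are

$$\lambda_x(t_f) = \nu x(t_f) \tag{6-14}$$

$$\lambda_y(t_f) = \nu y(t_f) \tag{6-15}$$

$$\lambda_u(t_f) = \lambda_v(t_f) = 0 \tag{6-16}$$

where $\nu$ is a constant Lagrange multiplier.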
Using the optimal controls $(\phi^*, \theta^*)$ in the equations of motion, we get the same reference differential equations for the open-loop solution as in Chapter IV.

$$\dot x = \frac{W\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} - u \tag{6-17}$$

$$\dot y = \frac{W\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} - v \tag{6-18}$$

$$\dot u = \frac{-F\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}} - ku^2 \tag{6-19}$$

$$\dot v = \frac{-F\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} - kv^2 \tag{6-20}$$

Making use of the neighboring extremal path development of Chapter II, we see that for the case of unspecified final time, the changes in the costates $\delta\lambda(t)$ are given by Eq (2-30) as a function of $\delta x(t)$, $d\nu$ and $dt_f$.

$$\delta\lambda(t) = S(t)\,\delta x(t) + R(t)\,d\nu + m(t)\,dt_f \tag{6-21}$$

To satisfy the additional necessary condition (Eq 2-24), we find that

$$\Phi = \nu[\tfrac{1}{2}(x^2+y^2) - \tfrac{1}{2}] \tag{6-22}$$

$$\frac{d\Phi}{dt} = \frac{\partial\Phi}{\partial t} + \frac{\partial\Phi}{\partial x}\dot x + \frac{\partial\Phi}{\partial y}\dot y + \frac{\partial\Phi}{\partial u}\dot u + \frac{\partial\Phi}{\partial v}\dot v = \nu[x\dot x + y\dot y] \tag{6-23}$$

This gives

$$\Omega(t_f) = \nu\left\{x\left[\frac{W\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} - u\right] + y\left[\frac{W\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} - v\right]\right\} + 1 = 0 \tag{6-24}$$

Substitute Eqs (6-14) and (6-15) into (6-24) and we get the value of the arbitrary constant

$$\nu = -\left\{W(x^2+y^2)^{1/2} - (ux + vy)\right\}^{-1}\Big|_{t=t_f} \tag{6-25}$$
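The multiplier for the rocket problem can be checked numerically in the same way as for the simple motion problem. The sketch below evaluates Eq (6-25) and substitutes the result into Eq (6-24); the function names are hypothetical, and the check assumes the multiplier comes out positive, which requires $ux + vy > W(x^2+y^2)^{1/2}$ at the final time.

```python
import math


def nu_rocket(x, y, u, v, W):
    """Lagrange multiplier of Eq (6-25), evaluated at the final time."""
    return -1.0 / (W * math.hypot(x, y) - (u * x + v * y))


def omega_rocket(x, y, u, v, W):
    """Left-hand side of Eq (6-24) with the costates of Eqs (6-14), (6-15).

    Assumes nu > 0 so that the costate direction equals the (x, y) direction.
    """
    nu = nu_rocket(x, y, u, v, W)
    lam_x, lam_y = nu * x, nu * y
    norm = math.hypot(lam_x, lam_y)
    return nu * (x * (W * lam_x / norm - u) + y * (W * lam_y / norm - v)) + 1.0
```

With terminal values $x = 1$, $y = 0$, $u = 4$, $v = 0$ and $W = 1$, the multiplier is 1/3 and the condition of Eq (6-24) is satisfied.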

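We will now determine the terms in Eqs (2-26) to (2-28). From Eq (6-22) we get

$$\frac{\partial^2\Phi}{\partial x^2} = \nu\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{6-26}$$

Using Eq (6-24) we get

$$\left[\frac{\partial\Omega}{\partial x}\right]_{t_f} = \nu\begin{bmatrix} W\lambda_x/(\lambda_x^2+\lambda_y^2)^{1/2} - u \\ W\lambda_y/(\lambda_x^2+\lambda_y^2)^{1/2} - v \\ -x \\ -y \end{bmatrix}_{t=t_f} \tag{6-27}$$

$$\frac{d\Omega}{dt}\bigg|_{t_f} = \nu\left\{\left[\frac{W\lambda_x}{(\lambda_x^2+\lambda_y^2)^{1/2}} - u\right]^2 + \left[\frac{W\lambda_y}{(\lambda_x^2+\lambda_y^2)^{1/2}} - v\right]^2 + x\left[\frac{F\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}} + ku^2\right] + y\left[\frac{F\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} + kv^2\right]\right\}\Bigg|_{t=t_f} \tag{6-28}$$

Equation (6-1) gives

$$\left[\frac{\partial\psi}{\partial x}\right]^T_{t_f} = \nu^{-1}\begin{bmatrix} \lambda_x \\ \lambda_y \\ 0 \\ 0 \end{bmatrix}_{t=t_f} \tag{6-29}$$

and

$$\frac{d\psi}{dt}\bigg|_{t_f} = \nu^{-1}\left[W(\lambda_x^2+\lambda_y^2)^{1/2} - (u\lambda_x + v\lambda_y)\right]\Big|_{t=t_f} \tag{6-30}$$

Using these equations along with Eqs (2-33) to (2-38) we get the following matrix Riccati and auxiliary differential equations for this problem.

$$\dot S = -SA - A^TS + SBS - C, \qquad S(t_f) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}_{t_f} \tag{6-31}$$

$$\dot R = -(A^T - SB)R, \qquad R(t_f) = \left[\frac{\partial\psi}{\partial x}\right]^T_{t_f} \tag{6-32}$$

$$\dot Q = R^TBR, \qquad Q(t_f) = (0) \tag{6-33}$$

$$\dot m = -(A^T - SB)m, \qquad m(t_f) = \left(\frac{\partial\Omega}{\partial x}\right)_{t_f} \tag{6-34}$$

$$\dot n = R^TBm, \qquad n(t_f) = \left(\frac{d\psi}{dt}\right)_{t_f} \tag{6-35}$$

$$\dot\alpha = m^TBm, \qquad \alpha(t_f) = \left(\frac{d\Omega}{dt}\right)_{t_f} \tag{6-36}$$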
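The coefficient matrices of Eqs (2-11) and (2-12) are

$$A(t) = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -2ku & 0 \\ 0 & 0 & 0 & -2kv \end{bmatrix} \tag{6-37}$$

$$B(t) = -\begin{bmatrix}
\dfrac{W\lambda_y^2}{(\lambda_x^2+\lambda_y^2)^{3/2}} & \dfrac{-W\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{3/2}} & 0 & 0 \\
\dfrac{-W\lambda_x\lambda_y}{(\lambda_x^2+\lambda_y^2)^{3/2}} & \dfrac{W\lambda_x^2}{(\lambda_x^2+\lambda_y^2)^{3/2}} & 0 & 0 \\
0 & 0 & \dfrac{-F\lambda_v^2}{(\lambda_u^2+\lambda_v^2)^{3/2}} & \dfrac{F\lambda_u\lambda_v}{(\lambda_u^2+\lambda_v^2)^{3/2}} \\
0 & 0 & \dfrac{F\lambda_u\lambda_v}{(\lambda_u^2+\lambda_v^2)^{3/2}} & \dfrac{-F\lambda_u^2}{(\lambda_u^2+\lambda_v^2)^{3/2}}
\end{bmatrix} \tag{6-38}$$

$$C(t) = -\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2k\lambda_u & 0 \\ 0 & 0 & 0 & 2k\lambda_v \end{bmatrix} \tag{6-39}$$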


Problem  Algorithm 

The computer program for this unspecified final time Rocket problem follows the same steps outlined in Chapter V.

The program input data was

tf = 5.0        x(tf) = 1.0
W  = 1.0        y(tf) = 0.0
F  = 2.0        u(tf) = 4.0
                v(tf) = 0.0

and the "real" equations of motion were assumed to be the same as those used in Chapter IV for the fixed final time Rocket problem. They were assumed to be

$$\dot x = -u \tag{6-40}$$

$$\dot y = W - v \tag{6-41}$$

$$\dot u = \frac{-F\lambda_u}{(\lambda_u^2+\lambda_v^2)^{1/2}} - ku^2 \tag{6-42}$$

$$\dot v = \frac{-F\lambda_v}{(\lambda_u^2+\lambda_v^2)^{1/2}} - kv^2 \tag{6-43}$$

Results  and  Analysis 

The results from this free final time problem are presented in Table IV, page 56, and Fig 7, page 57. For this data, the capture circle radius R equals [value illegible in the source]. It was found that capture would not occur if R = 1. Therefore the minimum radius that assures capture is someplace between the two values, but it was not specifically determined.

The computed final time $t_f$ is the sum of $t_f$, the actual final time from the previous iteration, and the computed value of $dt_f$ at the final sampling time. The actual capture (or final) time is determined by integrating the "real" equations of motion forward from the last sampling time until the terminal constraint is satisfied ($\psi = 0$). Except for the cases where the sampling step size equals 0.2, the computed final time appears to provide an optimistic final cost as compared to the actual final cost. This is another example of the computational errors introduced by the larger sampling step sizes. We see that as the sampling step size decreases the resulting final cost also decreases.

From Fig 7, we can see that by enforcing the transversality conditions we achieve a definite reduction in final cost, especially for the larger sampling step sizes. We are in essence starting with an updated two-point boundary-value problem each time we enforce the transversality conditions.

Table IV

Results of Rocket, Free Final Time Problem

Run     Sampling        Integration     Computed           Actual
No.     Step Size Δt    Step Size       Final Time         Final Time
                                        tf = tf + dtf      (ψ = 0)

1       .2              .001            4.0934             4.0830
1*      .2              .001            4.1118             4.1110
2       .5              .001            4.1898             4.2020
2*      .5              .001            4.1683             4.1750
3       1.0             .001            4.2575             4.5430
3*      1.0             .001            4.2074             4.2840

*: Transversality conditions enforced.
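The comparison between computed and actual final times can be read directly from Table IV. The sketch below tabulates the difference between the two for each run; the sampling step size of run 2 is taken to be .5 by analogy with run 2*, which is an assumption, and the names `TABLE_IV` and `optimism` are hypothetical.

```python
# (run, sampling step size, computed final time, actual final time), Table IV.
TABLE_IV = [
    ("1", 0.2, 4.0934, 4.0830),
    ("1*", 0.2, 4.1118, 4.1110),
    ("2", 0.5, 4.1898, 4.2020),   # sampling step size assumed, see lead-in
    ("2*", 0.5, 4.1683, 4.1750),
    ("3", 1.0, 4.2575, 4.5430),
    ("3*", 1.0, 4.2074, 4.2840),
]


def optimism(row):
    """Actual minus computed final time; positive means the computed
    final time underestimates (is optimistic about) the capture time."""
    return row[3] - row[2]


def reduction_from_enforcing(table, step):
    """Drop in actual final time when the transversality conditions are
    enforced (starred run) at a given sampling step size."""
    plain = [r for r in table if r[1] == step and not r[0].endswith("*")][0]
    starred = [r for r in table if r[1] == step and r[0].endswith("*")][0]
    return plain[3] - starred[3]
```

For a sampling step size of 1.0 the computed final time is optimistic by about 0.29 without enforcement and about 0.08 with it, and enforcing the transversality conditions lowers the actual final time by about 0.26.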
VII. Conclusions and Recommendations

There are many questions left unanswered by this brief attempt at applying the proposed method of generating near-optimal closed-loop solutions to a few example differential game problems. But some general observations can be made. There are a number of factors that influence the final cost in differential games. Among them are the sampling step size, the size of the capture circle in the unspecified final time problems and enforcing the transversality conditions. It was found that generally the final cost varied directly with the sampling step size and inversely with the size of the capture circle. Enforcing the transversality conditions resulted in decreased final cost. In some problems it appears as though the integration step sizes from 0.1 to 0.001 had very little effect on the final cost. It appears as though one can use a coarse integration and yet not affect the final cost significantly. Therefore it may be possible by using a hybrid computer to approach "real time" closed-loop solutions. The analog computer would provide the coarse but rapid integration of the differential equations. In any practical application of this method, we would want to update the TPBVP as often as possible. But the minimum sampling step size $\Delta t$ is limited by the time required to perform the numerical calculations needed to update the solution. During the updating interval, the players must base their strategies on the "best" available information (the state of the game at the beginning of the last updated interval) as opposed to the perfect or complete information based on present states. This should still be adequate provided the states of the game do not change too rapidly.

Although this method for generating near-optimal closed-loop solutions is most applicable to differential game problems, it would also be applicable in many optimal control problems. One of the main limitations of the method is that we must have the solution to the TPBVP. At times, solving the TPBVP could be quite an accomplishment in itself. Also, all previous discussion was limited to solutions in the small. Even then, we did not begin to examine the many problems available through various possible combinations of final time, control constraints and terminal constraints. To say the least, there is a lot more work to be done in this area.